Item Imputation with the Discrete Edit System
Authors
Abstract
The Fellegi-Holt algorithm (Fellegi and Holt 1976) provides a framework for item imputation: for each record with one or more edit failures, it identifies a minimal set of fields that must be imputed in order to satisfy a cohesive set of edits. The set is minimal in the sense that at least one joint value for its fields, when substituted, yields a record with no edit failure, and no smaller set of fields also provides an edit solution. In this paper, we develop an imputation strategy based on a framework similar to Fellegi-Holt. The input to the imputation process consists of a quasi-minimal set of fields for each record that fails the edits, together with distributional information on the fields of the records that do not fail the edits. The goal of the paper is to show how DISCRETE, an edit-impute system developed at the Census Bureau, processes this distributional information to retrieve joint values for a set of fields such that, first, these joint values resolve all the edit conflicts and, second, the solution provided is optimal relative to a decision rule based on a likelihood function. In other words, we show how DISCRETE seeks to resolve edit conflicts with a solution that is not only minimal in the Fellegi-Holt sense but also probable with respect to a likelihood function.
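To make the two ideas in the abstract concrete, here is a toy sketch in Python. It is not the DISCRETE implementation: the fields, domains, edits, and function names are all hypothetical, the search for field sets is brute force (DISCRETE works from a quasi-minimal set supplied as input), and the "likelihood" is replaced by a simple frequency count over records that pass the edits.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical categorical fields and their domains (illustration only).
DOMAINS = {
    "age": ["child", "adult", "senior"],
    "marital": ["single", "married"],
    "employed": ["no", "yes"],
}

# Each edit returns True when the record FAILS it (a forbidden pattern).
EDITS = [
    lambda r: r["age"] == "child" and r["marital"] == "married",
    lambda r: r["age"] == "child" and r["employed"] == "yes",
]


def fails(record):
    """True if the record violates at least one edit."""
    return any(edit(record) for edit in EDITS)


def minimal_field_sets(record):
    """Brute-force search for the smallest field sets that admit an edit solution."""
    if not fails(record):
        return [()]
    fields = list(DOMAINS)
    for size in range(1, len(fields) + 1):
        found = []
        for subset in combinations(fields, size):
            # The subset qualifies if some joint value makes the record pass.
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if not fails(candidate):
                    found.append(subset)
                    break
        if found:
            return found  # no smaller set worked, so these are minimal
    return []


def best_imputation(record, field_set, clean_records):
    """Pick the edit-passing joint value seen most often among clean records.

    The frequency count is an assumed stand-in for a likelihood-based rule.
    """
    counts = Counter(tuple(r[f] for f in field_set) for r in clean_records)
    candidates = []
    for values in product(*(DOMAINS[f] for f in field_set)):
        candidate = {**record, **dict(zip(field_set, values))}
        if not fails(candidate):
            candidates.append((counts[values], candidate))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


failing = {"age": "child", "marital": "married", "employed": "yes"}
clean = [
    {"age": "adult", "marital": "married", "employed": "yes"},
    {"age": "adult", "marital": "single", "employed": "no"},
    {"age": "senior", "marital": "married", "employed": "no"},
]
field_sets = minimal_field_sets(failing)
imputed = best_imputation(failing, field_sets[0], clean)
```

In this toy record, changing only `marital` or only `employed` leaves one edit violated, so the single field `age` is the only minimal set. Both "adult" and "senior" resolve the edits, and the frequency rule prefers the value more common among the clean records.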
Similar resources
General Methods and Algorithms for Modeling and Imputing Discrete Data under a Variety of Constraints
Loglinear modeling methods have become quite straightforward to apply to discrete data X. The models for missing data involve minor extensions of hot-deck methods (Little and Rubin 2002). Edits are structural zeros that forbid certain patterns. Winkler (2003) provided the theory for connecting edit with imputation. In this paper, we give methods and algorithms for modeling/edit/imputation under...
A Comparison Study of ACS If-Then-Else, NIM, and DISCRETE Edit and Imputation Systems Using ACS Data
In any statistical survey, the information gathered may contain inconsistent, incorrect, or missing data. These erroneous data need to be revised or filled in prior to data tabulations and retrieval. The revisions of the erroneous data should not affect the statistical inferences of the data. The missing data, as well as some inconsistent or incorrect data, are easy to identify while others are n...
An Empirical Comparison of Performance of the Unified Approach to Linearization of Variance Estimation after Imputation with Some Other Methods
Imputation is one of the most common methods to reduce item nonresponse effects. Imputation results in a complete data set, and then it is possible to use naïve estimators. After using most common imputation methods, mean and total (imputation estimators) are still unbiased. However, their variances (imputation variances) are underestimated by naïve variance estimators. Sampling mechanism an...
CANCEIS Experiments of Edit and Imputation with 2006 Census Test Data
In this report, we demonstrate the CANCEIS (CANadian Census Edit and Imputation System) experiments of edit and imputation with the 2006 test data. The major effort is to translate the if-then-else rules of the current edit and imputation system of the decennial census into the decision logic tables (DLT) of CANCEIS. We also formulate the input files that are needed to run the CANCEIS software. The...
Developing Imputation Models for the Services Sectors Portion of the Economic Census
The editing software used by the Economic Census offers a variety of imputation options, many of which employ statistical models. In prior censuses, the services sectors portion of the Economic Census has relied on industry average imputation as its primary statistical imputation model. This ratio imputation method uses weighted least squares estimates for no-intercept simple linear regression ...